Grid Checkpointing Architecture - a revised proposal

Authors

  • Gracjan Jankowski
  • Radoslaw Januszewski
  • Rafal Mikolajczak
  • Jozsef Kovacs
Abstract

Contemporary Grid environments are characterized by increasing virtualization and distribution of resources. These trends place greater demands on load-balancing and fault-tolerance capabilities. The checkpoint-restart mechanism seems to be the most intuitive tool for meeting these requirements. However, because widely available, production-grade checkpoint-restart tools are still lacking, higher-level checkpoint-restart services are not yet well developed. One of the goals of the CoreGRID Network of Excellence is to define a high-level checkpoint-restart Grid Service and to locate it among other Grid Services. We aim to define both the abstract model of that service and the lower-layer interface that will allow the service to cooperate with diverse existing and future checkpoint-restart tools. This paper is the first step toward that goal. It sketches the overall architecture of the proposed service and its connection with the actual checkpoint-restart tools.


Similar resources

Stability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid

Grid computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Moreover, fault tolerance and task scheduling are important issues for large-scale computational grids because of the unreliable nature of grid resources. A commonly exploited technique to realize fault tolerance is periodic checkpointing that periodically ...

An Architecture for Checkpointing and Migration of Distributed Components on the Grid

Sriram Krishnan. A computational Grid is a set of hardware and software resources that provide seamless, dependable, and pervasive access to high-end computational capabilities. The Grid differs from other computational resources such as traditional supercomputers and clusters by the following key features: (1)...

The Architecture of the XtreemOS Grid Checkpointing Service

The EU-funded XtreemOS project implements a grid operating system (OS) transparently exploiting distributed resources through the SAGA and POSIX interfaces. XtreemOS uses an integrated grid checkpointing service (XtreemGCP) for implementing migration and fault tolerance. Checkpointing and restarting applications in a grid requires saving and restoring applications in a distributed heterogeneous...

Integrated Process Management in a Grid Checkpointing Environment

For many businesses, the ability to manage dynamic distributed environments has become a key success factor. Joint industry and/or academic cooperations exploit resources spanning multiple administrative domains with millions of nodes and thousands of users. In order to run the overall business effectively, Grid technologies can be applied. The EU-funded XtreemOS project implements a grid operat...

DEE: A Distributed Fault Tolerant Workflow Enactment Engine for Grid Computing

Designing and implementing a workflow management system that supports scalable execution of large-scale scientific workflows in distributed and unstable Grid environments is a large and complex task. In this paper we describe the Distributed workflow Enactment Engine (DEE) of the ASKALON application development environment for Grid computing. DEE proposes a decentralized architecture that simp...



Publication date: 2006